What?


In statistics, a distribution is simply a way to understand how a
set of data points is spread over some given range of values.


[Figure: a set of data points, with probability or frequency on the vertical axis and some given range of values on the horizontal axis]


Types of distribution...


There are many, many different types of statistical distribution - each of
which represents different types of data, and/or serves different purposes...


Here we will cover several commonly used distributions... 


Normal Distribution


Binomial Distribution


Uniform Distribution


t-Distribution


Bernoulli Distribution


Poisson Distribution


Normal Distribution 1


A normal distribution shows the probability density for a
population of continuous data (for example, height in cm for all
NBA players).


In other words, it shows how likely it is that any player from the
NBA is of a certain height. Most players are around the
mean/average height; fewer are much taller or much shorter.


A normal distribution is symmetrical on both sides of the mean. You
might also see this referred to as a Gaussian Distribution!


[Figure: normal curve over Height (x), peaking at the mean height of NBA players; the vertical axis is the probability of any randomly selected player being of height x]


Normal Distribution 2


The spread of the values in our population is measured using a
metric called standard deviation.


The Empirical Rule tells us that... 


68.3% of the values will fall between
1 standard deviation above and
below the mean


95.5% of the values will fall between
2 standard deviations above and
below the mean


99.7% of the values will fall between
3 standard deviations above and
below the mean
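

The three percentages above can be checked with a few lines of Python. This is a minimal sketch that assumes a perfectly normal population; the helper name `share_within` is made up for illustration. It relies on the fact that the share of a normal distribution within k standard deviations of the mean equals erf(k / sqrt(2)), and it should match the slide's figures to within rounding.

```python
import math


def share_within(k: float) -> float:
    """Share of a normal population within k standard deviations of the mean."""
    # For a standard normal variable Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")
```

Real data is rarely exactly normal, so treat these as rules of thumb rather than guarantees.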
Normal Distribution 3


As an example... 


Say we know that for all players in the NBA, the mean height is 200cm 
and the standard deviation is 7cm. 


If LeBron James is 206cm tall - what proportion of NBA players is he 
taller than? We can figure this out! 


LeBron is 6cm taller than the mean (206cm - 200cm). Since the standard
deviation is 7cm, he is 0.86 standard deviations (6cm / 7cm) above the mean.


[Figure: normal curve with the mean (200cm) and LeBron (206cm) marked; players shorter than LeBron lie to the left of 206cm, players taller than LeBron to the right]


Our value of 0.86 standard deviations is called the z-score. This can be
converted to a percentile using the cumulative distribution function (or a lookup
table), giving us our answer. LeBron James is taller than about 80.5% of
players in the NBA!
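

The LeBron calculation can be sketched in Python as follows. It assumes NBA heights are exactly normally distributed with the mean and standard deviation quoted in the example; the function name `normal_cdf` and the variable names are illustrative. The z-score is rounded to two decimal places, as you would do before using a lookup table.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF: share of the population below a given z-score."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


mean_cm = 200.0    # mean height from the example
sd_cm = 7.0        # standard deviation from the example
lebron_cm = 206.0  # LeBron's height from the example

# z-score: how many standard deviations LeBron sits above the mean.
z = round((lebron_cm - mean_cm) / sd_cm, 2)
percentile = normal_cdf(z)

print(f"z-score: {z}")
print(f"share of players shorter than LeBron: {percentile:.1%}")
```

Using the unrounded z-score (6 / 7) shifts the answer only slightly, to a little over 80%.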
t-Distribution


Just like a normal distribution, a t-distribution is symmetrical around the
mean, and the breadth is based around the deviation within the data.


While a normal distribution works with a population - a t-distribution is
designed for situations where sample size is small. The shape of the
t-distribution becomes broader as the sample size decreases, to take into
account the extra uncertainty we are faced with.


[Figure: t-distribution curves for 1 DF and 5 DF, plotted over height]


The shape of a t-distribution relates to the number of degrees of freedom,
which is calculated as the sample size minus one.


As the sample size, and thus the degrees of freedom, gets larger, the
t-distribution tends towards a normal distribution - as with a larger sample
we're more certain around estimating the true population statistics.
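

To see the "broader with fewer degrees of freedom" behaviour in numbers, here is a rough sketch that compares the density of a standardised t-distribution with the standard normal density at a point far from the mean (3 units above it). The sample sizes are arbitrary illustrations, and `t_pdf` and `normal_pdf` are hand-written helpers rather than library functions.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # lgamma is used instead of gamma so large df values do not overflow.
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


tail_point = 3.0  # a value well away from the mean
for sample_size in (2, 6, 31, 1001):  # illustrative sample sizes
    df = sample_size - 1  # degrees of freedom = sample size minus one
    print(f"df = {df:4d}: density at {tail_point} = {t_pdf(tail_point, df):.5f}")
print(f"normal  : density at {tail_point} = {normal_pdf(tail_point):.5f}")
```

The tail density shrinks towards the normal value as the degrees of freedom grow, which is the convergence described above.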
Binomial Distribution


A Binomial Distribution can end up looking a lot like the shape of a normal
distribution. The main difference is that instead of plotting continuous data,
it plots how often one of two possible discrete outcomes occurs across a
fixed number of trials - for example, the number of heads from flipping a coin.


Imagine flipping a coin 10 times, and from those 10 flips, noting down how
many were "Heads". It could be any number between 0 and 10.


Now imagine repeating that task 1,000 times...


If the coin we are using is indeed fair (not biased to heads or tails) then the
distribution of outcomes should start to look like the plot above. In around
two-thirds of cases we get 4, 5, or 6 "heads" from each set of 10 flips, and
more extreme results are much rarer!
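

The exact probabilities behind the coin-flip example can be computed directly. The sketch below assumes a fair coin and independent flips; `binomial_pmf` is an illustrative helper built on the standard binomial formula, and the text histogram is only a rough stand-in for a plot.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n_flips = 10   # flips per set, as in the example
p_heads = 0.5  # a fair coin

pmf = [binomial_pmf(k, n_flips, p_heads) for k in range(n_flips + 1)]

for k, prob in enumerate(pmf):
    bar = "#" * round(prob * 100)  # crude text histogram
    print(f"{k:2d} heads: {prob:.3f} {bar}")

middle = sum(pmf[4:7])  # chance of getting 4, 5 or 6 heads
print(f"P(4, 5 or 6 heads) = {middle:.3f}")
```

Changing `p_heads` to something like 0.8 shows how a biased coin shifts and skews the shape.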
Bernoulli Distribution


The Bernoulli Distribution is a special case of the Binomial Distribution
with just a single trial. It considers only two possible outcomes, success or
failure, true or false.


It's a really simple distribution, but worth knowing! 


In the example below we're looking at the probability of rolling a 6 with a 
standard die. 


If we roll a die many, many times, we should end up rolling a 6 about
1 out of every 6 times (or 16.7%), and thus not rolling a 6 (in other words,
rolling a 1, 2, 3, 4 or 5) about 5 times out of 6 (or 83.3%)!
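

As a small sketch of the die example, the snippet below writes the Bernoulli probabilities out explicitly and checks that the binomial formula with a single trial gives the same two numbers. It assumes a fair six-sided die; `bernoulli_pmf` is an illustrative helper name.

```python
import math


def bernoulli_pmf(outcome: int, p: float) -> float:
    """Probability of a single-trial outcome: 1 = success, 0 = failure."""
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return p if outcome == 1 else 1.0 - p


p_six = 1 / 6  # chance of rolling a 6 with a fair die

print(f"rolled a 6:       {bernoulli_pmf(1, p_six):.1%}")
print(f"did not roll a 6: {bernoulli_pmf(0, p_six):.1%}")

# The binomial formula with n = 1 trial gives the same two probabilities.
binomial_n1 = [
    math.comb(1, k) * p_six**k * (1 - p_six) ** (1 - k) for k in (0, 1)
]
print(binomial_n1)
```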


[Figure: probability of each outcome, rolled a 6 versus did not roll a 6; the bar for not rolling a 6 is labelled 83.3%]


Uniform Distribution


A Uniform Distribution is a distribution in which all events are equally 
likely to occur. 


Below, we're looking at the results from rolling a die many, many times.
We're looking at which number we got on each roll and tallying these up.
If we roll the die enough times (and the die is fair) we should end up with an
almost perfectly uniform tally, because the chance of getting any outcome is
exactly the same.
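

The die-rolling experiment is easy to simulate. This is a minimal sketch using Python's pseudo-random number generator as a stand-in for a fair die; the seed and the number of rolls are arbitrary choices made so the run is repeatable.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the run is repeatable
n_rolls = 60_000         # illustrative number of rolls

# Roll the die n_rolls times and tally how often each face comes up.
tally = Counter(rng.randint(1, 6) for _ in range(n_rolls))

for face in range(1, 7):
    share = tally[face] / n_rolls
    print(f"face {face}: {share:.3f}")
```

Each share should land close to 1/6 (about 0.167), and the gap narrows as `n_rolls` grows.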
Poisson Distribution


A Poisson Distribution is a discrete distribution similar to the Binomial
Distribution (in that we’re plotting the probability of whole numbered
outcomes).


Unlike the normal and t-distributions, this one is generally not
symmetrical - it is instead bounded between 0 and infinity, so its tail
stretches out to the right.


The Poisson distribution describes the number of events or outcomes that 
occur during some fixed interval. Most commonly this is a time interval like 
in our example below where we are plotting the distribution of sales per 
hour in a shop. 
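

A quick sketch of the sales-per-hour idea is shown below. The average rate of 3 sales per hour is a made-up number for illustration (the original example does not state one), and `poisson_pmf` is an illustrative helper implementing the standard Poisson formula.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events in an interval with average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


mean_sales_per_hour = 3.0  # hypothetical rate, not taken from the original

probs = [poisson_pmf(k, mean_sales_per_hour) for k in range(11)]

for k, prob in enumerate(probs):
    bar = "#" * round(prob * 100)  # crude text histogram
    print(f"{k:2d} sales: {prob:.3f} {bar}")
```

The printed shape starts at 0 sales, peaks near the average rate, and trails off slowly to the right.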
[Figure: distribution of sales per hour]


https://data-science-infinity.teachable.com
Taught by former Amazon 
& Sony PlayStation Data 
Scientist Andrew Jones 

